Children are notoriously bad at delaying gratification to achieve later, greater rewards (e.g., 
Piaget, 1970)—and some are worse at waiting than others. Individual differences in the 
ability-to-wait have been attributed to self-control, in part because of evidence that 
long-delayers are more successful in later life (e.g., Shoda, Mischel, & Peake, 1990). Here 
we provide evidence that, in addition to self-control, children’s wait-times are modulated 
by an implicit, rational decision-making process that considers environmental reliability. 
We tested children (M = 4;6, N = 28) using a classic paradigm, the marshmallow task 
(Mischel, 1974), in an environment demonstrated to be either unreliable or reliable. Children in 
the reliable condition waited significantly longer than those in the unreliable condition 
(p < 0.0005), suggesting that children’s wait-times reflected reasoned beliefs about 
whether waiting would ultimately pay off. Thus, wait-times on sustained delay-of-gratification 
tasks (e.g., the marshmallow task) may not only reflect differences in self-control 
abilities, but also beliefs about the stability of the world. 

© 2012 Elsevier B.V. All rights reserved. 
1. Introduction


When children draw on walls, reject daily baths, or
leave the house wearing no pants and a tutu, caretakers
may reasonably doubt their capacity for rational
decision-making. However, recent evidence suggests that even
very young children possess sophisticated decision-making
capabilities for reasoning about physical causality
(e.g., Gopnik et al., 2004; Gweon & Schulz, 2011), social
behavior (e.g., Gergely, Bekkering, & Kiraly, 2002), future
events (e.g., Denison & Xu, 2010; Kidd, Piantadosi, & Aslin,
2012; Téglás et al., 2011), concepts and categories (e.g.,
Piantadosi, Tenenbaum, & Goodman, 2012; Xu, Dewar, &
Perfors, 2009), and word meanings (e.g., Xu & Tenenbaum,
2007). Here we demonstrate that young children also use
their rational decision-making abilities in a domain of
behavioral inhibition: a sustained delay-of-gratification
task.

Decision-making is said to be rational if it maximizes 
benefit or utility (Anderson, 1991; Anderson & Milson, 
1989; Marr, 1982), yet young children’s decisions during 
delay-of-gratification tasks often appear to do just the 
opposite (e.g., Mischel & Ebbesen, 1970). When asked to 
resist the temptation of an immediately available low- 
value reward to obtain one of high-value after a temporal 
delay, 75% of children failed to do so, succumbing to their 
desire after an average of 5.72 min. The cause of these 
apparent failures of rationality, however, is not fully 
understood. While children’s failures to wait are likely 
the result of a combination of many genetic and environ- 
mental variables, two potentially important factors are 
self-control capacity and established beliefs. 
1.1. Deficient capacity hypothesis 


One possible explanation for failing to wait for a larger 
reward is a deficiency in self-control; some children are 
simply incapable of inhibiting their immediate-response 
tendency to seek gratification. Young infants, for example, 
have not yet developed the executive functions necessary 
for inhibitory control (e.g., Piaget, 1970), as evidenced by 
the perseveration errors made by up to 2-year-old children 
in A-Not-B tasks (e.g., Marcovitch & Zelazo, 1999; Piaget, 
1954). As predicted by this theory, children’s ability to de- 
lay gratification improves with maturation (e.g., Mischel & 
Metzner, 1962). Maturational changes, however, are insufficient
to account for all of the variance in task performance
(e.g., Romer, Duckworth, Sznitman, & Park,
2010). Individual differences in children’s capacities for
self-control may account for the remaining variance. 

Self-control has been implicated as a major causal fac- 
tor in a child’s later life successes (or failures). Mischel, 
Shoda, and Peake (1988) analyzed data from adolescents 
who, many years earlier, had been presented with a labo- 
ratory choice-task: eat a single marshmallow immediately, 
or resist the temptation during a sustained delay to receive 
two marshmallows. With no means of distracting them- 
selves from a treat left in view, the majority of children 
failed to wait for the maximum delay (15 or 20 min) before 
eating the marshmallow, with a mean wait-time of 6 min 
and 5 s. Importantly, longer wait-times among children
were correlated with greater self-confidence and better 
interpersonal skills, according to parental report. Longer 
wait-times also correlated with higher SAT scores (Shoda 
et al., 1990), less likelihood of substance abuse (Ayduk 
et al., 2000), and many other positive life outcomes (e.g., 
Mischel, Shoda, & Rodriguez, 1989). Based on these find- 
ings, the marshmallow task was argued to be a powerful 
diagnostic tool for predicting personal well-being and la- 
ter-life achievement—“an early indicator of an apparently 
long-term personal quality” (Mischel et al., 1988). The lo- 
gic of the claim is that a child who possesses more self- 
control can resist fleeting temptations to pursue difficult 
goals; in contrast, children with less self-control fail to per- 
sist toward these goals and thus achieve less. To be clear, 
the evidence for poor self-control in young children (e.g., 
Baumeister, Heatherton, & Tice, 1994; Goleman, 1995), in 
a wide variety of tasks and contexts, is undeniable. At issue 
is the origin of failure of delay-of-gratification in laboratory 
tests like the marshmallow task, which has remained lar- 
gely speculative (Mischel et al., 1989, p. 936). 


1.2. Rational decision-making hypothesis 


Another possibility is that the variance in children’s 
performance may be due to differences in children’s expec- 
tations and beliefs (Mahrer, 1956; Mischel, 1961; Mischel 
& Staub, 1965). Under this theory, children engage in ra- 
tional decision-making about whether to wait for the sec- 
ond marshmallow. This implicit process of making rational 
decisions is based upon beliefs that the child acquired be- 
fore entering the testing room. The basis for this theory 
centers on what it means to be rational in the context of 
the marshmallow task. Waiting is only the rational choice
if you believe that a second marshmallow is likely to actually
appear after a reasonably short delay, and that the
marshmallow currently in your possession is not at risk 
of being taken away. This presumption may not apply 
equally to all children. Consider the mindset of a 4-year- 
old living in a crowded shelter, surrounded by older chil- 
dren with little adult supervision. For a child accustomed 
to stolen possessions and broken promises, the only guar- 
anteed treats are the ones you have already swallowed. At 
the other extreme, consider the mindset of an only-child in 
a stable home whose parents reliably promise and deliver 
small motivational treats for good behavior. From this 
child’s perspective, the rare injustice of a stolen object or 
broken promise may be so startlingly unfamiliar that it 
prompts an outburst of tears. The critical point of the fore- 
going vignette is that rational behavior is inferred by 
understanding the goals and expectations of the agent 
(Anderson, 1991; Anderson & Milson, 1989; Marr, 1982). 
Relevant to this hypothesis is the fact that children with 
absent fathers more often prefer immediate, lesser rewards 
over delayed, more valuable ones (Mischel, 1961). Also, 
children’s willingness to wait is negatively impacted by 
uncertainty about the likelihood, value, or temporal avail- 
ability of the future reward (Fawcett, McNamara, & 
Houston, 2012; Mahrer, 1956; McGuire & Kable, 2012; 
Mischel, 1974; Loewenstein, Read, & Baumeister, 2003).
These effects are consistent with the idea that children 
may be capable of engaging in a rational process when 
deciding whether or not to wait. 

In support of this second hypothesis, we present evi- 
dence that the reliability of the experimenter in the testing 
environment influences children’s wait-times during the 
marshmallow task. Half of the children observed evidence 
that the researcher was reliable in advance of the marsh- 
mallow task, while half observed evidence that she was 
unreliable. If children employ a rational process in deciding 
whether or not to eat the first marshmallow, we expect 
children in the reliable condition to be significantly more 
likely to wait than those in the unreliable condition. Our 
experiment provides a fundamental test of this perspective 
on children’s rational behavior and provides compelling 
evidence that young children are indeed capable of delay- 
ing gratification in the face of temptation when provided 
with evidence that waiting will pay off. 


2. Materials and methods 
2.1. Participants 


Twenty-eight caretakers volunteered their children 
(ages 3;6 - 5;10) for the study. The children were all 
healthy, had not recently visited the lab (within 2 months), 
and had not interacted with researchers running the study 
since infancy. These precautions ensured children had 
minimal prior expectations specific to the lab or research- 
er’s reliability before this study. Children were randomly 
assigned to one of two experimental conditions—unreliable 
and reliable—such that each group was gender and age bal- 
anced (nine males, five females, and M = 4;6). Participants 
received a small treat bag and $10 as compensation. 
2.2. Procedure 


2.2.1. Art project task 

Before the marshmallow task, children were first pro- 
vided with evidence about the reliability of the researcher 
through the completion of a two-part art project involving 
a Create-Your-Own-Cup kit (with which children could 
decorate a blank paper slip to be inserted into a special 
cup). Each of the project’s two parts involved a crucial 
choice. In Choice 1, the child could either use well-used 
crayons or wait for a new set of art supplies. In Choice 2, 
the child could either use one small sticker or wait for a 
new set of better stickers. Upon arrival, children were es- 
corted to the “art project room” that was not part of the 
normal lab space and where parents could covertly observe 
them from the main lab space. 

For Choice 1, the researcher presented the child with a 
small set of well-used crayons in a tightly sealed wide- 
mouth jar. The researcher explained that the child could 
use the crayons now, or wait until the researcher returned 
with a brand-new set of exciting art supplies to use in- 
stead. The researcher then placed the tightly sealed crayon 
jar in the center of the table and left the child alone in the 
room to wait for 2.5 min. Though we wanted children to 
ostensibly have a choice, we wanted them to choose to 
wait. Thus, the chosen container was intentionally difficult 
to open. This manipulation was successful, and all children 
waited the full 2.5 min without using the well-used cray- 
ons. In the unreliable condition, the researcher returned 
without the promised art set and provided the following 
explanation: “I’m sorry, but I made a mistake. We don’t have 
any other art supplies after all. But why don’t you just use 
these instead?” The researcher then helped the child open 
the jar of well-used crayons. In the reliable condition, the 
researcher returned with a rotating tray featuring a large 
assortment of exciting art supplies. (See Appendix A.1 for 
full scripted dialog.) In both conditions, the researcher 
encouraged the child to draw for 2 min. 

For Choice 2, the researcher produced a round 1/4-in.
reward-style sticker, sealed inside a plastic envelope, from
their pocket. The researcher explained that the child could
use the small sticker now, or wait until the researcher
returned with a larger number of better stickers to use
instead. The researcher then placed the small sealed sticker
in the center of the table and left the child alone in the
room to wait for 2.5 min. As in Choice 1, the sticker
packaging was difficult to open by design: the sticker was
glued down and covertly sealed inside the plastic envelope
with superglue. This preparation was ultimately unnecessary,
however, as children were so occupied with drawing
during this delay that they did not examine the sticker.
This manipulation was also successful, and all children
waited the full 2.5 min without using the 1/4-in.
reward-style sticker. In the unreliable condition, the researcher
returned without the promised stickers and provided the
following explanation: “I’m sorry, but I made a mistake. We
don’t have any other stickers after all. But why don’t you just
use this one instead?” The researcher then offered to help
the child open the sealed sticker package and covertly
swapped it out for an identical usable version. In the
reliable condition, the researcher returned with 5–7 large
die-cut stickers featuring a desirable theme
(e.g., Toy Story, Disney Princesses). Unbeknownst to the
child, the caretaker had selected that set of stickers in
advance of the study as especially desirable. In both conditions,
the researcher then encouraged the child to work on their
drawing for 2 min.

Thus, children were provided with two sources of evi- 
dence that the experimenter—and more generally the test- 
ing situation—was either unreliable or reliable. 


2.2.2. Marshmallow task 

The marshmallow task immediately followed the two- 
part art task. Once the table was cleared, the researcher re- 
vealed a single marshmallow to the child and provided the 
following explanation: 


“You finished just in time, because now it’s snack time! 
You have a choice for your snack. You can eat this one 
marshmallow right now. Or—if you can wait for me to go 
get more marshmallows from the other room—you can 
have two marshmallows to eat instead. How does that 
sound? [Response.] Okay, I’m going to go get more marsh- 
mallows from the other room and turn your picture into a 
cup! You should stay right here in that chair. Can you do 
that? [Response.] I'll leave this [marshmallow] here, and 
if you haven’t eaten it when I come back, you can have 
two marshmallows instead!” 


The researcher placed the marshmallow directly in 
front of the child, 4 in. from the table’s edge. The research- 
er then quickly collected the art materials and drawing and 
exited the room. The child was left alone in the room, while 
under covert observation via webcam, until either they 
consumed the marshmallow or until 15 min had elapsed. 
Regardless of whether they waited, each child was ulti- 
mately given three additional marshmallows at the conclu- 
sion of the study. 

We note that this final portion of the experimental pro- 
cedure is slightly different from those used by the studies 
analyzed in Shoda et al. (1990). Major features of the delay 
situation are identical; however, we did not require
children to explicitly signal their desire to stop waiting before
eating the lesser treat. The original paradigms involved 
training children to expect that the experimenter would 
return upon use of an explicit signal (e.g., ringing a bell). 
Since this would necessarily provide children with addi- 
tional information about the experimenter’s reliability (as 
well as add time and complication to our already lengthy 
experimental procedure), we omitted it. As an additional 
benefit, this simplified procedure ensures that even very 
young children could quickly and easily understand the 
task. 


2.2.3. Coding 

Two naive coders (who were unaware of the experi- 
mental conditions) reviewed blinded videos of children in 
the marshmallow task and recorded when each child’s first 
taste—a lick or bite—occurred. Judgments were checked 
against one another to ensure reliability: 78.57% matched 
exactly, 14.29% differed by 1 s, and 7.14% differed by 2 s.
When judgments differed, the later time was used. Coders
also quantified excitement by measuring smiling time (s)
and assigning a subjective rating of apparent contentedness
(on a 1–9 scale) at the onset of the waiting period (first
30 s). Additionally, the degree of physical movement (fidgetiness)
was measured via a computer script that quantified
the mean number of pixel changes across frames
during the same 30-s time interval.
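
The agreement figures reported for the two coders can be recomputed from the per-child judgments listed in Table A.1. The following is a minimal sketch, not the authors' own analysis code: the function names are hypothetical, and the Coder 2 value for subject 1 is illegible in the table and is assumed here to be 17 s because the listed difference is 0.

```python
from collections import Counter

# (coder 1, coder 2) wait-time judgments in seconds, transcribed from Table A.1.
# Assumption: subject 1's Coder 2 value is taken to be 17 (listed difference is 0).
UNRELIABLE = [(17, 17), (21, 19), (7, 7), (15, 15), (10, 10), (31, 31),
              (496, 498), (900, 900), (457, 457), (72, 73), (18, 18),
              (150, 150), (195, 195), (149, 150)]
RELIABLE = [(900, 900), (785, 785), (431, 430), (900, 900), (59, 59),
            (144, 145), (900, 900), (900, 900), (900, 900), (900, 900),
            (594, 594), (900, 900), (900, 900), (900, 900)]


def agreement_summary(pairs):
    """Map each absolute coder difference (s) to the percentage of children."""
    counts = Counter(abs(a - b) for a, b in pairs)
    return {diff: 100 * n / len(pairs) for diff, n in sorted(counts.items())}


def resolved_wait_times(pairs):
    """Apply the stated rule: when the coders disagree, keep the later time."""
    return [max(a, b) for a, b in pairs]


summary = agreement_summary(UNRELIABLE + RELIABLE)
for diff, pct in summary.items():
    print(f"coders differ by {diff} s: {pct:.2f}% of children")
```

Taking the later of the two judgments is the conservative choice for a first-taste measure, since it never shortens a child's recorded wait.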


3. Results 


Mean wait-times are shown in Fig. 1. Because the task 
was terminated at 15 min, children who had not eaten 
the marshmallow may have waited longer if the experi- 
mental design had permitted. Thus, this analysis is a con- 
servative estimate of the true difference between the two 
conditions. Children in the unreliable condition waited 
without eating the marshmallow for a mean duration of 
3 min and 2 s (M = 181.57 s). In contrast, children in the
reliable condition waited 12 min and 2 s (M = 722.43 s). A
Wilcoxon rank-sum test (also known as a Mann-Whitney
Wilcoxon or a Mann-Whitney U) confirmed that this difference
was highly significant (W = 22.5, p < 0.0005). Thus,
children in the unreliable condition waited significantly 
less than those in the reliable condition. 
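
As an illustration of the rank-sum comparison, the sketch below recomputes the group means and the W statistic from the wait-times in Table A.1 (using the later coder judgment where the two differed). It is a hedged reconstruction: the paper does not state which software was used or how the p-value was obtained, so the normal approximation with tie and continuity corrections, and the function name `rank_sum_test`, are assumptions.

```python
import math
from collections import Counter

# Wait-times in seconds (later of the two coders' judgments), from Table A.1.
unreliable = [17, 21, 7, 15, 10, 31, 498, 900, 457, 73, 18, 150, 195, 150]
reliable = [900, 785, 431, 900, 59, 145, 900, 900, 900, 900, 594, 900, 900, 900]


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney) test.

    Returns (W, p). W counts the (x, y) pairs with x > y, with ties counting
    0.5. The p-value is an assumption of this sketch: a normal approximation
    with tie correction and continuity correction.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    w = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    # Tie correction: sum of t^3 - t over groups of tied values.
    tie_term = sum(t**3 - t for t in Counter(x + y).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    # Continuity correction: shrink |W - mean| by 0.5 before standardizing.
    z = max(abs(w - n1 * n2 / 2) - 0.5, 0.0) / math.sqrt(variance)
    p = math.erfc(z / math.sqrt(2))  # two-sided tail of the standard normal
    return w, p


w, p = rank_sum_test(unreliable, reliable)
mean_unreliable = sum(unreliable) / len(unreliable)
mean_reliable = sum(reliable) / len(reliable)
print(f"means: {mean_unreliable:.2f} s vs {mean_reliable:.2f} s")
print(f"W = {w}, p = {p:.5f}")
```

A rank-based test suits these data because wait-times are capped at 900 s, which produces many tied values at the ceiling and a strongly non-normal distribution.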
Fig. 1. Mean wait-times of children in each condition. Error bars show 
95% confidence intervals. Children in the unreliable condition waited 
without eating the marshmallow for a mean duration of 3 min and 2 s 
(M = 181.57 s). In contrast, those in the reliable condition waited 12 min 
and 2 s (M = 722.43 s). A Wilcoxon rank-sum test found this difference 
to be highly significant (W = 22.5, p < 0.0005). Here, 15 min was used as 
the wait-time for children who did not eat the marshmallow until the 
researcher returned, though these children may have actually waited 
longer if the experimental design had permitted. 


We also conducted a binary analysis of whether chil- 
dren waited the entire 15 min without tasting the marsh- 
mallow (Fig. 2). In the unreliable condition, only 1 out of 
the 14 children (7.1%) waited the full 15 min; in the reliable 
condition, however, 9 out of the 14 children (64.3%) 
waited. A two-sample test for equality of proportions with 
continuity correction at two-tailed α = 0.05 (Newcombe, 1998)
was highly significant (χ² = 7.62, df = 1, p < 0.006). Thus,
children in the unreliable condition were significantly less 
likely to wait the full 15 min than those in the reliable 
condition. 
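
The proportion comparison can be checked by hand from the counts above (1 of 14 versus 9 of 14). The sketch below assumes the test is the standard chi-square test on the 2 x 2 table with Yates' continuity correction; the function name `two_proportion_test` is hypothetical and the code is an illustration rather than the authors' script.

```python
import math


def two_proportion_test(successes_a, n_a, successes_b, n_b):
    """Chi-square test for equal proportions with Yates' continuity correction.

    Returns (chi2, p) for the 2x2 table of waited / did not wait by group.
    The p-value uses the chi-square distribution with 1 degree of freedom.
    """
    a, b = successes_a, n_a - successes_a
    c, d = successes_b, n_b - successes_b
    n = n_a + n_b
    # Yates' correction subtracts n/2 from |ad - bc| (floored at zero).
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * corrected**2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For df = 1, P(X > chi2) equals the two-sided normal tail at sqrt(chi2).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# 1 of 14 children waited in the unreliable condition, 9 of 14 in the reliable one.
chi2, p = two_proportion_test(1, 14, 9, 14)
print(f"chi-square = {chi2:.4f}, df = 1, p = {p:.4f}")
```

The continuity correction matters at this sample size: with only 14 children per group, the uncorrected statistic would overstate the evidence against equal proportions.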

Additionally, we performed a linear regression with age 
and gender as predictors, controlling for condition. Neither 
factor, age (β = 8.57, t = 1.29, p > 0.20) nor gender
(β = −11.63, t = −0.10, p > 0.92), was significant in our
sample. Detailed subject data appear in Appendix A.2. 

Since these results might alternatively be explained by 
a difference in mood across the two groups (e.g., by differ- 
ently induced levels of either frustration or excitement), an 
analysis of the three relevant measures—apparent content- 
edness, smiling, and fidgetiness—was conducted (see 
Appendix A.3). Results suggested that these variables did 
not vary systematically across the two conditions. 
Fig. 2. Proportion of children who waited the full 15 min without eating 
the marshmallow by condition. Error bars show 95% confidence intervals. 
In the unreliable condition, only 1 out of the 14 children (7.1%) waited the 
full 15 min; in the reliable condition, however, 9 out of the 14 children 
(64.3%) waited. We tested the difference using a two-sample test for 
equality of proportions with continuity correction at two-tailed α = 0.05. The test 
found it to be highly significant (χ² = 7.6222, df = 1, p < 0.006). 
4. Discussion 


The results of our study indicate that young children’s 
performance on sustained delay-of-gratification tasks can 
be strongly influenced by rational decision-making pro- 
cesses. If self-control capacity differences were the primary 
causal mechanism implicated in children’s wait-times, 
then information about the reliability of the environment 
should not have affected them. If deficiencies in self-con- 
trol caused children to eat treats early, then one would ex- 
pect such deficiencies to be present in the reliable 
condition as well as in the unreliable condition. The effect 
we observed is consistent with converging evidence that 
young children are sensitive to uncertainty about future 
rewards (Fawcett et al., 2012; Mahrer, 1956; McGuire & 
Kable, 2012). 

The resulting effect of our experimental manipulation 
was quite robust (ΔM_delay = 9 min, p < 0.0005). Importantly,
while there were small procedural differences between our 
study and past studies, children—age and gender-matched 
to the current study—who faced similar choices without 
prior explicit evidence of experimenter reliability waited for 
around 6 min (e.g., 6.08 min in Shoda et al. (1990)¹ and
5.71 min in Mischel & Ebbesen (1970)²). When we manipulated
experimenter reliability, children waited twice that
long in the reliable condition (12.03 min), and half as long 
in the unreliable condition (3.02 min). While further work 
will be required to explicitly test the relative contributions 
of different factors, preliminary comparisons suggest that 
the influence of a child’s beliefs about the reliability of the 
world is at least comparable to their capacity for
self-control.³

To be clear, our data do not demonstrate that self-con- 
trol is irrelevant in explaining the variance in children’s 
wait-times on the original marshmallow task studies. They 
do, however, strongly indicate that it is premature to con- 
clude that most of the observed variance—and the longitu- 
dinal correlation between wait-times and later life 
outcomes—is due to differences in individuals’ self-control 
capacities. Rather, an unreliable worldview, in addition to 
self-control, may be causally related to later life outcomes, 
as already suggested by an existing body of evidence (e.g., 
Barnes & Farrell, 1992; Smyke, Dumitrescu, & Zeanah, 
2002). 


5. Conclusions 


We demonstrated that children’s sustained decisions to 
wait for a greater reward rather than quickly taking a les- 
ser reward are strongly influenced by the reliability of the 
environment (in this case, the reliability of the researcher’s 
verbal assurances). More broadly, we have shown that
young children’s performance on delay-of-gratification
tasks can be strongly influenced by an implicit rational
decision-making process.


1 Condition: exposed reward, no ideation instructions.

2 Condition: immediate reward.

3 Two additional manipulation results from Shoda et al. (1990) that may
inform relative effect-size estimates: (1) obscuring visual contact with the
rewards during the wait (attention manipulation) increased mean wait-times
by 3.75 min and (2) suggesting that children think about the larger
reward (ideation strategy) increased them by 2.53 min.
Appendix A. Supplementary material 


A.1. Additional scripted dialogue 


The experimenter used the following scripted dialog to 
explain each stage of the testing procedure to young 
study participants. 


1. Onset of experiment - “So, today we have a very 
exciting art project planned for you! Upstairs, we 
have everything we'll need for you to make your own 
cup like this one! And you'll be able to take it home 
with you! Does that sound like something you'd like 
to do?” 


2. Art material choice - “To decorate your cup, you 
have a choice of what art supplies to use. You could 
use these [crayons] right now. Or—if you can wait for 
me to go get them from another room—you can use 
our big set of art supplies instead. The big set has 
markers, pens, colored pencils—a lot of cool stuff. 
How does that sound? [Response.] Okay, I’m going to 
go get the big set of art supplies from the other room. 
You should stay right here in that chair. Can you do 
that? [Response.] I'll leave these [crayons] right here, 
and if you haven’t used them when I come back, you 
can use our big set of art supplies instead!” 


3. Sticker choice - “Would you like to add a sticker to 
your picture? [Response.] For stickers, you have a 
choice. You can use this [sticker] right now. Or, if 
you can wait for me to go get them from the other 
room, you can have a bunch of stickers to use 
instead. How does that sound? [Response.] Okay, I’m 
going to go get more stickers from the other room. 
You should stay right here in that chair. Can you do 
that? [Response.] I'll leave this [sticker] here and if 
you haven't used it when I come back, you can have a 
bunch of stickers to use instead!” 


A.2. Detailed subject data 


Table A.1 contains the wait-time judgments of two 
video coders for each child participating in the study. 
Children were randomly assigned to one of two 
experimental conditions (unreliable and reliable) such 
that each group was gender and age balanced (nine 
males, five females, and M = 4;6). Two naive coders 
watched videos of the children waiting during the final 
stage of the testing procedure (the marshmallow task). 
The videos were blinded for condition. The coders 
measured each child’s wait-time until first taste (i.e., lick 
or bite). The coders’ timing judgments were checked 
against one another to ensure validity, and when timing 
judgments differed, the later judgment was used (and 
is marked with an asterisk in Table A.1). The judgments of the 
two coders were found to differ by at most 2 s on each 
wait-time. Children’s behavior was also coded in terms of 
a binary outcome measure corresponding to whether or 
not they waited the entire 15 min without tasting the 
marshmallow (as indicated in the “Waited 15” column in 
Table A.1). The percentages in this column reflect the 
portion of the group that waited the full 15 min: 7.14% in 
the unreliable condition and 64.29% in the reliable one. 


Table A.1. Gender, age, and wait-times for each study participant.

                                            Wait-time (s)
Condition    Subject ID  Gender  Age      Coder 1     Coder 2  Diff  Waited 15
Unreliable   1           m       3;7      17          17       0     n
             3           f       4;0      21*         19       2     n
             5           m       4;0      7           7        0     n
             7           m       4;0      15          15       0     n
             9           f       4;0      10          10       0     n
             11          f       4;1      31          31       0     n
             13          m       4;4      496         498*     -2    n
             15          f       4;5      900         900      0     y
             17          m       4;6      457         457      0     n
             19          m       4;10     72          73*      -1    n
             21          m       5;3      18          18       0     n
             23          m       5;4      150         150      0     n
             25          f       5;7      195         195      0     n
             27          m       5;7      149         150*     -1    n
             Summary     9m, 5f  M = 4;6  M = 181.57                 7.14%

Reliable     2           m       3;7      900         900      0     y
             4           m       3;8      785         785      0     n
             6           f       3;8      431*        430      1     n
             8           f       4;0      900         900      0     y
             10          m       4;0      59          59       0     n
             12          f       4;1      144         145*     -1    n
             14          m       4;6      900         900      0     y
             16          f       4;6      900         900      0     y
             18          m       4;7      900         900      0     y
             20          m       4;9      900         900      0     y
             22          m       4;11     594         594      0     n
             24          f       5;5      900         900      0     y
             26          m       5;9      900         900      0     y
             28          m       5;10     900         900      0     y
             Summary     9m, 5f  M = 4;6  M = 722.43                 64.29%

* Later of two differing judgments; used as the child's wait-time.


Table A.2. Mood-variable means and statistical significance tests.

                                     Group means
Measure                              Unreliable (N = 14)   Reliable (N = 14)      Wilcoxon rank sum test  Independent samples t-test
Contentedness (mean z-score)         M = 0.03 (sd = 0.89)  M = −0.03 (sd = 0.89)  W = 106.5, p > 0.71     t = 0.178, df = 26, p > 0.85
Smiling (s)                          M = 3.16 (sd = 3.68)  M = 4.45 (sd = 6.53)   W = 96.5, p > 0.96      t = 0.644, df = 26, p > 0.52
Fidgeting (interframe pixel change)  M = 0.51 (sd = 0.36)  M = 0.61 (sd = 0.39)   W = 97, p > 0.98        t = 0.000, df = 26, p = 1.00


A.3. Analysis of mood variables 


We used three control variables to investigate the 
potential influence of mood on children’s wait times: 
contentedness, smiling, and fidgeting. Each measurement 
was based on a portion of each child’s video data—the 
first 30 s of the waiting period. 


1. Contentedness - Two naive coders rated each 
child’s apparent contentedness on a scale from 1-9, 
with 1 indicating very sad and 9 indicating very 
happy. We computed z-scores for each coder’s 
judgments, and then a mean z-score for each child. 


2. Smiling - Two naive coders measured for how long 
each child smiled (s). We use the mean of these two 
judgments. 


3. Fidgeting - A Python script automatically 
calculated an estimate of each child’s movement. 
The script computed the mean number of pixel 
changes frame-to-frame for each child, above a 
noise threshold (diff > 50). The threshold served to 
control for pixel changes caused by the noise 
inherent in digital frame-to-frame comparisons of 
this type (caused by, for example, small differences 
in compression and subtle lighting changes). Thus, 
the threshold enabled us to measure only changes 
caused by the body movements of each child. 
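
The fidgeting measure in item 3 can be sketched as follows. This is an illustrative reconstruction only, since the original script is not reproduced here: the function name `mean_pixel_changes`, the grayscale list-of-rows frame format, and the absence of any normalization (for example by frame size) are all assumptions. Only the thresholding rule (difference > 50) comes from the description above.

```python
def mean_pixel_changes(frames, threshold=50):
    """Mean number of pixels that change by more than `threshold`
    between consecutive frames.

    `frames` is a sequence of equally sized grayscale images, each given as
    a list of rows of integer intensities (assumed range 0-255).
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames to compare")
    counts = []
    for prev, curr in zip(frames, frames[1:]):
        # Count pixels whose intensity change exceeds the noise threshold.
        changed = sum(
            1
            for prev_row, curr_row in zip(prev, curr)
            for p, c in zip(prev_row, curr_row)
            if abs(c - p) > threshold
        )
        counts.append(changed)
    return sum(counts) / len(counts)


# Toy 2x3 "video": one pixel jumps by 100 between frames 1 and 2 (counted),
# another changes by only 30 (treated as noise), and frame 3 repeats frame 2.
frames = [
    [[10, 10, 10], [10, 10, 10]],
    [[10, 110, 10], [10, 10, 40]],
    [[10, 110, 10], [10, 10, 40]],
]
fidget_score = mean_pixel_changes(frames)
print(fidget_score)
```

In the toy clip, one pixel exceeds the threshold in the first transition and none in the second, so the score is the average of 1 and 0. Small intensity changes, such as those from compression or lighting, fall below the threshold and are ignored.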


The mood-variable means for each group and the 
results of two types of statistical tests appear in Table 
A.2. Wilcoxon rank sum tests indicated that these 
variables did not significantly differ across conditions in 
our sample population. Independent samples t-tests 
(two-tailed α = 0.05) also failed to detect significant differences 
across conditions. 


